Predicting Emotion Labels for Chinese Microblog Texts

Authors

  • Zheng Yuan
  • Matthew Purver
Abstract

We describe an experiment in detecting emotions in texts on the Chinese microblog service Sina Weibo (www.weibo.com) using distant supervision via various author-supplied emotion labels (emoticons and smilies). Existing word segmentation tools proved unreliable; better accuracy was achieved using character-based features. Higher-order n-grams proved to be useful features. Accuracy varied according to label and emotion: while smilies are used more often, emoticons are more reliable. Happiness is the most accurately predicted emotion, with accuracies around 90% on both distant and gold-standard labels. This approach works well and achieves high accuracies for happiness and anger, while it is less effective for sadness, surprise, disgust and fear, which are also difficult for human annotators to detect.
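
The two ideas in the abstract, distant supervision from author-supplied markers and character-based n-gram features, can be illustrated with a minimal sketch. Everything named here is an assumption for illustration only: the marker-to-emotion map, the function names `distant_label` and `char_ngrams`, and the choice to discard texts carrying markers for more than one emotion are hypothetical and are not taken from the paper.

```python
from collections import Counter

# Hypothetical marker-to-emotion map (assumption): the paper's actual
# inventory of emoticons and smilies is not listed in this abstract.
MARKER_TO_EMOTION = {
    ":)": "happiness",
    ":(": "sadness",
    "[怒]": "anger",
}


def distant_label(text, marker_map=MARKER_TO_EMOTION):
    """Return (emotion, text_without_markers).

    The emotion is None when the text carries no marker, or markers for
    more than one emotion (an assumed filtering rule, not the paper's).
    """
    found = {emotion for marker, emotion in marker_map.items() if marker in text}
    if len(found) != 1:
        return None, text
    cleaned = text
    # Strip the markers so a classifier cannot simply memorise the label.
    for marker in marker_map:
        cleaned = cleaned.replace(marker, "")
    return found.pop(), cleaned.strip()


def char_ngrams(text, max_n=3):
    """Count character n-grams for n = 1..max_n, ignoring whitespace.

    No word segmentation is involved: each character is a unit.
    """
    chars = [c for c in text if not c.isspace()]
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(chars) - n + 1):
            counts["".join(chars[i:i + n])] += 1
    return counts


if __name__ == "__main__":
    label, cleaned = distant_label("今天真开心 :)")
    print(label, cleaned)
    print(char_ngrams(cleaned, max_n=2))
```

The resulting n-gram counts could be fed to any standard text classifier; which classifier the authors used is not stated in this abstract, so none is shown here.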

Similar resources

Emotion Classification in Microblog Texts Using Class Sequential Rules

This paper studies the problem of emotion classification in microblog texts. Given a microblog text which consists of several sentences, we classify its emotion as anger, disgust, fear, happiness, like, sadness or surprise if available. Existing methods can be categorized as lexicon based methods or machine learning based methods. However, due to some intrinsic characteristics of the microblog ...

Emotion Corpus Construction Based on Selection from Hashtags

The availability of labelled corpus is of great importance for supervised learning in emotion classification tasks. Because it is time-consuming to manually label text, hashtags have been used as naturally annotated labels to obtain a large amount of labelled training data from microblog. However, natural hashtags contain too much noise for it to be used directly in learning algorithms. In this...

Microblog Emotion Classification by Computing Similarity in Text, Time, and Space

Most work in NLP analysing microblogs focuses on textual content thus neglecting temporal and spatial information. We present a new interdisciplinary method for emotion classification that combines linguistic, temporal, and spatial information into a single metric. We create a graph of labeled and unlabeled tweets that encodes the relations between neighboring tweets with respect to their emoti...

Towards Scalable Emotion Classification in Microblog Based on Noisy Training Data

The availability of labeled corpus is of great importance for emotion classification tasks. Because manual labeling is too time-consuming, hashtags have been used as naturally annotated labels to obtain a large amount of labeled training data from microblog. However, the inconsistency and noise in annotation can adversely affect the data quality and thus the performance when used to train a classi...

User Embedding for Scholarly Microblog Recommendation

Nowadays, many scholarly messages are posted on Chinese microblogs and more and more researchers tend to find scholarly information on microblogs. In order to exploit microblogging to benefit scientific research, we propose a scholarly microblog recommendation system in this study. It automatically collects and mines scholarly information from Chinese microblogs, and makes personalized recommen...

Publication date: 2015